# A DETAILED REVIEW OF FAST DATA THROUGHPUT USING VARIOUS STUDIES

Mr. P. Balakrishna<sup>1</sup>, Dr. Julian Savari Antony<sup>2</sup>

<sup>1</sup>Research Scholar, Shri JJT University, Churela, Rajasthan. <sup>2</sup>Guide, Shri JJT University, Churela, Rajasthan.

#### **Abstract:**

The mobile network is required to support an enormous volume of mobile data traffic and a huge number of wireless connections, and to achieve better spectrum and energy efficiency, as well as quality of service, reliability, and security. Moreover, the upgraded mobile network will also incorporate high-mobility requirements as an essential part, offering satisfactory service to users travelling at speed. This paper gives an overview of potential high-mobility wireless communication techniques for mobile networks. After discussing the typical requirements and challenges, key techniques to cope with the challenges are reviewed, including transmission techniques under fast time-varying channels, network architecture with mobility support, and mobility management. Finally, future research directions on high-mobility communications are given.

**Keywords:** FEC (Forward Error Correction), AVA (Adaptive Viterbi Algorithm), BM (Branch Metric)

#### 1. Introduction:

Wireless communication has always been one of the most vibrant fields, as it continually confronts profound challenges such as high-speed data transmission over wireless networks, delivery of high-definition audio and video, improved voice quality, and expanded broadband data services.

The evolution of wireless communication technologies from the second generation (2G) to the current fourth generation (4G) has seen a surge in data transmission rates, which are predicted to exceed 3 Gbps in next-generation wireless communication standards. Consequently, each communication block in the physical layer of a wireless communication system must process data at this rate. The channel decoder is an integral part of a wireless communication system and is responsible for reliable data communication.

A channel decoder that employs turbo codes for error correction delivers excellent bit-error-rate performance, which has led to the wide adoption of these codes by various wireless communication standards. Both 3G and 4G wireless communication standards, with their respective peak data rates, include turbo codes for error correction.

Turbo codes are one of the most powerful types of Forward Error-Correcting (FEC) channel codes. Since the emergence of digital communication systems, there has been a need for error correction. This is due to the non-ideal nature of practical communication channels, which are often corrupted by noise. Error correction attempts to compensate for the errors introduced by this noise.

The advantages of forward error correction are that a backchannel is not required and retransmission of data can often be avoided (at the cost of higher bandwidth requirements, on average). FEC is therefore applied in situations where retransmissions are relatively costly or impossible.
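As a minimal illustration of the trade-off described above, the sketch below uses a (3,1) repetition code, which is far weaker than a turbo code and is chosen here only because it is easy to follow. The function names are hypothetical. Each information bit is sent three times (the bandwidth cost), and the receiver corrects a single flipped bit per group by majority vote, with no back-channel or retransmission.

```python
def encode_repetition(bits, n=3):
    """Repeat every information bit n times (code rate 1/n)."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(received, n=3):
    """Majority vote over each group of n received bits."""
    decoded = []
    for i in range(0, len(received), n):
        group = received[i:i + n]
        decoded.append(1 if sum(group) > n // 2 else 0)
    return decoded


message = [1, 0, 1, 1]
codeword = encode_repetition(message)

# Simulate a noisy channel that flips one bit in the first and last groups.
corrupted = codeword.copy()
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode_repetition(corrupted)
print(recovered == message)
```

The code triples the number of transmitted bits, which is the bandwidth penalty mentioned above; practical FEC codes achieve much better protection per redundant bit.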

Turbo codes were first introduced in 1993 by Berrou, Glavieux and Thitimajshima, and provide near-optimal performance approaching the Shannon limit. Turbo coding is the channel coding scheme for Long Term Evolution (LTE), and the turbo decoder is typically one of the major blocks in an LTE wireless receiver. Turbo decoders suffer from high decoding latency due to the iterative decoding process, the forward-backward recursion in the maximum a posteriori (MAP) decoding algorithm, and the interleaving and de-interleaving between iterations.

#### 2. Literature Review

[1] combined block-interleaved pipelining (BIP), a high-throughput technique, with look-ahead computation and constructed a BIP MAP decoder architecture that provides a throughput gain of 1.96 at the cost of a 63% area overhead. Compared to the parallel architecture, the BIP architecture provided the same speed-up with logic complexity reduced by a factor of M, where M is the level of parallelism. The symbol-based BIP architecture provided a speed-up in the range from 1 to 2, with exponentially higher logic complexity and a state metric storage requirement reduced by a factor of M compared to a parallel architecture. They also provided a detailed comparison of various high-throughput VLSI architectures in terms of their storage requirements, logic complexity, throughput gain, and silicon area. These analyses were validated via physical design examples for turbo decoders of parallel concatenated convolutional codes (PCCC) and turbo equalizers. Reducing memory access is crucial to achieving a low-power design in a memory-intensive algorithm like turbo decoding.

ISSN: 2320 – 8791 (Impact Factor: 2.317)

## www.ijreat.org

[2] described a power-efficient implementation of an adaptive Viterbi decoder. To measure its power consumption, the adaptive Viterbi algorithm (AVA) architecture was implemented in two contemporary FPGA architectures for a range of constraint lengths. For a given fixed BER and decode rate, power savings were achieved by adapting the constraint length of the convolutional code employed, with the goal of employing a lower-power decoder when allowable. The dynamically reconfigurable FPGA implementation was shown to consume significantly less power than a static FPGA implementation.

[3] proposed memory-reduced MAP decoding for double-binary convolutional turbo codes (DB CTC), presenting a memory-reduced VLSI architecture based on the Maximum A Posteriori Probability (MAP) algorithm. For this kind of Soft-In Soft-Out (SISO) decoding, the Branch Metrics (BMs) become the dominant factor in determining the overall memory size required inside the SISO decoder. They partitioned each BM into the sum of an Information Metric (IM) and a Parity Metric (PAM) without introducing any computational overhead, which leads to a 50% reduction of the memory size for BMs. In addition, the extrinsic metrics were made independent of the a posteriori LLR to minimize the memory elements. The MAP algorithm was then reformulated as a function of IMs and PAMs.

[4] demonstrated memory arrangements in turbo decoders using the sliding-window BCJR algorithm. Turbo coding is a powerful encoding and decoding technique that can provide highly reliable data transmission at extremely low signal-to-noise ratios. Owing to the computational complexity of the decoding algorithm employed, the realization of turbo decoders usually requires a large amount of memory and potentially a long decoding delay. They focused on the development of general formulas for efficient memory management of turbo decoders employing the sliding-window BCJR algorithm. Three simple but general results were presented to evaluate the required memory size, throughput rate, and latency based on the speed and the number of processors adopted. The results thus provided useful and general information on practical implementations of turbo decoders.

[5] presented a hardware implementation of the Max-Log-MAP algorithm based on the MacLaurin series for a turbo decoder. The hardware implementation of a MAP decoder is generally quite complex, and the original MAP algorithm suffers from serious drawbacks in this respect. To overcome this disadvantage, the Max-Log-MAP and Log-MAP algorithms had been proposed to reduce the complexity. An improved Max-Log-MAP algorithm based on the MacLaurin series was proposed by Shahram et al. to further reduce the complexity. The authors proposed a hardware architecture for this modified Max-Log-MAP algorithm. By applying the MacLaurin series and replacing all the multipliers with shifters and adders, the performance of the Max-Log-MAP decoder was improved. The modified Max-Log-MAP decoder consumed 37.7569 mW of power. This method provided near-optimal performance with very reasonable complexity.
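The difference between these algorithms can be sketched numerically. Log-MAP relies on the Jacobian logarithm, $\ln(e^a + e^b) = \max(a, b) + \ln(1 + e^{-|a-b|})$, while Max-Log-MAP drops the correction term. The MacLaurin expansion of the correction term about zero is $\ln(1 + e^{-x}) = \ln 2 - x/2 + x^2/8 - \dots$; the sketch below keeps only the linear part and clamps it at zero. This particular truncation is an assumption made for illustration, since the review does not state the exact form used in [5], and the function names are hypothetical.

```python
import math


def max_star_exact(a, b):
    """Jacobian logarithm ln(e^a + e^b), as used by Log-MAP."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_maxlog(a, b):
    """Max-Log-MAP: the correction term is dropped entirely."""
    return max(a, b)


def max_star_maclaurin(a, b):
    """Linear MacLaurin approximation of the correction term, clamped at zero.

    ln(1 + e^-x) is approximated by ln 2 - x/2 (assumed truncation);
    the division by two is a one-bit right shift in fixed-point hardware.
    """
    x = abs(a - b)
    return max(a, b) + max(0.0, math.log(2.0) - 0.5 * x)


pairs = [(0.0, 0.0), (1.0, 0.5), (2.0, -1.0), (5.0, -5.0)]
for a, b in pairs:
    exact = max_star_exact(a, b)
    err_maxlog = exact - max_star_maxlog(a, b)
    err_maclaurin = exact - max_star_maclaurin(a, b)
    print(f"a={a:5.1f} b={b:5.1f} exact={exact:.4f} "
          f"max-log error={err_maxlog:.4f} maclaurin error={err_maclaurin:.4f}")
```

Because the correction term is convex, its tangent at zero never overshoots it, so the clamped linear approximation is never worse than dropping the term, while requiring only a subtraction, a shift, and a comparison.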

[6] discussed VLSI architectures for the MAP algorithm, extending techniques used in the Viterbi algorithm to the MAP algorithm. They tried to replace RAM with registers, and a memory bank was shared between two MAP decoders to obtain an efficient implementation of the turbo decoder. They presented several techniques for the very large-scale integration (VLSI) implementation of the maximum a posteriori (MAP) algorithm. Bounds were derived for the dynamic range of the state metrics, which enable the designer to optimize the word length. The computational kernel of the algorithm was the Add-MAX operation, which is the Add-Compare-Select (ACS) operation of the Viterbi algorithm with an added offset. They showed that the critical path of the algorithm could be reduced if the Add-MAX operation was reordered into an Offset-Add-Compare-Select operation by adjusting the location of registers; that is, it was better to add the offset first and then perform the ACS operation. A general scheduling for the MAP algorithm was presented which gave the trade-offs between computational complexity, latency, and memory size. As a general conclusion, the well-known results from the Viterbi algorithm literature could be applied to the MAP algorithm.

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 8, Issue 1, Feb - March, 2020

[7] A pipelined MAP decoder IC was designed in TSMC 0.18-µm 1.8-V CMOS process technology. High-throughput operation was achieved via block-interleaved pipelining of the ACS kernel. The 8.7-mm² chip achieves a clock frequency of 285 MHz with 330 mW of power consumption at a 1.8-V supply. This core achieves the highest clock frequency reported for any 0.18-µm design and does so with the smallest area.

[8] The Noisy Channel Theorem discovered by C. E. Shannon in 1948 offered communication engineers the possibility of reducing error rates on noisy channels to negligible levels without sacrificing data rates. The primary obstacle to the practical use of this theorem has been the equipment complexity and the computation time required to decode the noisy received data.

[9] An informal LDPC group has been working on the goal of achieving consensus on an optional advanced LDPC code for the OFDMA PHY. Substantial text on LDPC has previously been harmonized and included within 802.16e; this contribution completes the LDPC specification text by adding code matrices fully compliant with the existing specification text. Simulation results are also provided to show that the LDPC code (selected after an extensive feature harmonization and down-selection process) offers excellent performance, significantly better than convolutional codes (CC) and the same as or better than the convolutional turbo codes (CTC) defined for 802.16.

[10] In the rescaling scheme, the minimum metric is subtracted from all metrics at each iteration. The use of two's complement arithmetic is proposed as an alternative to the rescaling method; this scheme avoids any kind of rescaling subtraction. Obvious advantages in implementation are hardware savings and a speed-up inside the metric update loop, which is critical to the decoder's computational throughput.
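The idea can be sketched in a few lines: path metrics are allowed to overflow a fixed-width register, and two metrics are compared by looking at the sign bit of their wrapped difference. The comparison stays correct as long as the true spread between metrics is less than half the register range. The toy trellis below (four fully connected states, random branch metrics from 0 to 7, an 8-bit register) is an assumption chosen for illustration and is not taken from [10]; all names are hypothetical.

```python
import random

BITS = 8
MOD = 1 << BITS      # metrics are kept in an 8-bit register
HALF = MOD >> 1


def wrap(x):
    """Keep only the low BITS bits, as a fixed-width adder would."""
    return x & (MOD - 1)


def mod_less(a, b):
    """a < b judged from wrapped values: sign bit of (a - b) mod 2^BITS."""
    return wrap(a - b) >= HALF


def acs_exact(metrics, bm):
    """Reference add-compare-select with unbounded integers."""
    n = len(metrics)
    new, choice = [], []
    for s in range(n):
        cands = [metrics[p] + bm[p][s] for p in range(n)]
        best = min(range(n), key=cands.__getitem__)
        new.append(cands[best])
        choice.append(best)
    return new, choice


def acs_modular(metrics, bm):
    """Add-compare-select on wrapped metrics, with no rescaling subtraction."""
    n = len(metrics)
    new, choice = [], []
    for s in range(n):
        cands = [wrap(metrics[p] + bm[p][s]) for p in range(n)]
        best = 0
        for p in range(1, n):
            if mod_less(cands[p], cands[best]):
                best = p
        new.append(cands[best])
        choice.append(best)
    return new, choice


def simulate(steps=1000, states=4, seed=1):
    """Return True if both decoders pick identical survivors at every step."""
    rng = random.Random(seed)
    exact = [0] * states
    modular = [0] * states
    for _ in range(steps):
        bm = [[rng.randrange(8) for _ in range(states)] for _ in range(states)]
        exact, c_exact = acs_exact(exact, bm)
        modular, c_mod = acs_modular(modular, bm)
        if c_exact != c_mod:
            return False
    return modular == [wrap(v) for v in exact]


print(simulate())
```

In this toy trellis the spread between candidate metrics never exceeds 14, far below the half-range of 128, so the wrapped comparison always agrees with the unbounded one and no minimum-subtraction step is needed.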

[11] The MAP decoder chip features a block-interleaved pipelined architecture, which enables the pipelining of the add-compare-select kernels. Measured results indicate that a turbo decoder based on the presented MAP decoder core can achieve: 1) a decoding throughput of 27.6 Mb/s with an energy efficiency of 2.36 nJ/b/iter; 2) the highest clock frequency compared to existing 0.18-µm designs with the smallest area; and 3) comparable throughput with an area reduction of 3-4.3× with reference to a look-ahead based high-speed design (Radix-4 design) and a parallel architecture.

[12] proposed a logic synthesis approach for domino/skewed logic styles based on Shannon expansion that dynamically identifies idle parts of logic and applies clock gating to them to reduce power in the active mode of operation. The technique also improves switching power by preventing redundant computations. Results on a set of MCNC benchmark circuits in a predictive 70 nm process exhibit improvements of 15% to 64% in total power, with minimal overhead in terms of delay and area, compared to conventionally synthesized domino/skewed logic.

[13] presented a deterministic clock-gating (DCG) technique which effectively reduces clock power. DCG is based on the key observation that for many of the pipelined stages of a modern processor, the circuit block usage in the near future is known a few cycles ahead of time. Their experiments show an average 19.9% reduction in processor power with virtually no performance loss for an eight-issue, out-of-order superscalar processor when DCG is applied to execution units, pipeline latches, D-cache word line decoders, and result bus drivers.

[14] presented three combinations of parallel-window (PW) and hybrid-window (HW) MAP decoding. Moreover, the computational modules and storage of the dual-mode (SB/DB) MAP decoding are designed to achieve high area utilization. To verify the proposed approaches, a 1.28 mm² dual-mode 2PW-1HW MAP processor was implemented in a 0.13 µm CMOS process. The prototype chip achieves a maximum throughput rate of 500 Mb/s at 125 MHz with an energy efficiency of 0.19 nJ/bit and an area efficiency of 3.13 bits/mm². For multi-standard systems, the expected throughput rates of the WiMAX and LTE CTC schemes are achieved by using five dual-mode 2PW-1HW MAP processors.
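The figures quoted for [14] can be cross-checked with simple arithmetic. The sketch below derives the number of decoded bits per clock cycle and the implied power, and it assumes that the quoted area efficiency means bits per clock cycle per mm², which the review does not state explicitly. The variable names are hypothetical.

```python
# Figures quoted in the text for the prototype chip in [14].
throughput_bps = 500e6       # 500 Mb/s
clock_hz = 125e6             # 125 MHz
energy_per_bit_j = 0.19e-9   # 0.19 nJ/bit
area_mm2 = 1.28              # 1.28 mm^2

# Decoded bits produced per clock cycle.
bits_per_cycle = throughput_bps / clock_hz

# Power implied by the energy efficiency at full throughput.
power_w = energy_per_bit_j * throughput_bps

# Assumed definition of area efficiency: bits per cycle per mm^2.
bits_per_cycle_per_mm2 = bits_per_cycle / area_mm2

print(f"bits per cycle: {bits_per_cycle:.1f}")
print(f"implied power: {power_w * 1e3:.1f} mW")
print(f"bits per cycle per mm^2: {bits_per_cycle_per_mm2:.3f}")
```

Under that assumption the chip decodes 4 bits per cycle, draws about 95 mW at full throughput, and delivers roughly 3.125 bits per cycle per mm², which is consistent with the quoted 3.13 after rounding.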

[15] presented a high-speed Max-Log MAP decoder for soft-in and soft-out trellis decoding. The high throughput is achieved with a two-dimensional ACS design on the high-radix trellis structure, resulting in a highly parallel and area-efficient decoder. The retiming technique is further applied to reduce the critical path delay of the ACS operation. After 0.13 µm CMOS chip implementation, the decoder occupies an area of 1.96 mm² containing 220K gates. The estimated timing under a 1.08 V supply and the worst-case corner shows that the test chip can achieve a maximum throughput of 952 MS/s. To the authors' knowledge, this Max-Log MAP decoder has the highest throughput with a modest hardware cost.

[16] examined the technology trend based on three industrial technologies (90, 65, and 45 nm) using a state-of-the-art processor as a benchmark: the UltraSPARC Niagara 2 from Sun Microsystems. The authors analyzed frequency, dynamic and static power, and area after synthesis while varying the power supply voltage and temperature. They then compared these exhaustive analyses of system-level performance as a function of technology to ITRS device-level estimations. The results suggest that this prediction can be of help when addressing both the technological scaling and the variability scenario of the selected technology. Correctly predicting specific values of performance variations when realistic conditions and technologies are changed could provide valuable information for the architect. The analysis advises the designer on the effective applicability of the ITRS trends to system performance, but also points out that a reliable system-level prediction should take the design complexity into account.

[17] Standard VLSI implementations of turbo decoding require substantial memory and incur a long latency, which cannot be tolerated in some applications. A parallel VLSI architecture for low-latency turbo decoding, comprising multiple soft-input soft-output (SISO) elements operating jointly on one turbo-coded block, is presented and compared to sequential architectures. A parallel interleaver is essential to process multiple concurrent SISO outputs. A novel parallel interleaver and an algorithm for its design are presented, achieving the same error correction performance as the standard architecture. Latency is reduced up to 20 times and throughput for large blocks is increased up to six-fold relative to sequential decoders, using the same silicon area and achieving a very high coding gain. The parallel architecture scales favorably: latency and throughput are improved with increased block size and chip area.

[18] presented a novel multi-code turbo decoder architecture for 4G wireless systems. To support various 4G standards, a configurable multi-mode MAP (maximum a posteriori) decoder is designed for both binary and duo-binary turbo codes with a small resource overhead (less than 10%) compared to the single-mode architecture. To achieve high data rates in 4G, the authors present a parallel turbo decoder architecture with scalable parallelism tailored to the given throughput requirements. High-level parallelism is achieved by employing contention-free interleavers. A multi-banked memory structure and a routing network among memories and MAP decoders are designed to operate at full speed with parallel interleavers. The authors designed a very low-complexity recursive on-line address generator supporting multiple interleaving patterns, which avoids the need for an interleaver address memory. Design trade-offs in terms of area and power efficiency are explored to find the optimal architectures. A 711 Mbps data rate is feasible with 32 Radix-4 MAP decoders running at a 200 MHz clock rate.
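To illustrate how interleaver addresses can be produced recursively without an address memory, the sketch below uses a quadratic permutation polynomial (QPP) interleaver, $\pi(i) = (f_1 i + f_2 i^2) \bmod N$, as one example pattern. The review does not describe the generator in [18], so this is only an assumed illustration of the general idea: since $\pi(i+1) - \pi(i) = f_1 + f_2(2i + 1)$, each address follows from the previous one using additions and conditional subtractions only. The parameters $N = 40$, $f_1 = 3$, $f_2 = 10$ are the LTE values for the shortest block size; the function names are hypothetical.

```python
def qpp_direct(i, n, f1, f2):
    """Direct evaluation of pi(i) = (f1*i + f2*i^2) mod n, needing multipliers."""
    return (f1 * i + f2 * i * i) % n


def qpp_recursive(n, f1, f2):
    """Yield pi(0), pi(1), ..., pi(n-1) using only add and conditional subtract."""
    addr = 0                  # pi(0)
    step = (f1 + f2) % n      # pi(1) - pi(0)
    inc = (2 * f2) % n        # constant second difference
    for _ in range(n):
        yield addr
        addr += step
        if addr >= n:         # addr and step are both below n, so one subtraction suffices
            addr -= n
        step += inc
        if step >= n:
            step -= n


N, F1, F2 = 40, 3, 10
recursive = list(qpp_recursive(N, F1, F2))
direct = [qpp_direct(i, N, F1, F2) for i in range(N)]
print(recursive[:4])
print(recursive == direct)
```

Only two small registers (the current address and the current step) are needed, which is why a recursive generator can replace a stored table of interleaver addresses.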

[19] A class of deterministic interleavers for turbo codes (TCs) based on permutation polynomials over $\mathbb{Z}_N$ is introduced. The main characteristic of this class of interleavers is that they can be algebraically designed to fit a given component code. Moreover, since the interleaver can be generated by a few simple computations, storage of the interleaver tables can be avoided. By using the permutation polynomial-based interleavers, the design of the interleavers reduces to the selection of the coefficients of the polynomials. It is observed that the performance of the TCs using these permutation polynomial-based interleavers is usually dominated by a subset of input weight $2m$ error events. The minimum distance and its multiplicity (or the first few spectrum lines) of this subset are used as the design criterion to select good permutation polynomials. A simple method to enumerate these error events for small $m$ is presented. Searches for good interleavers are performed. The decoding performance of these interleavers is close to that of S-random interleavers for long frame sizes. For short frame sizes, the new interleavers outperform S-random interleavers.
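The statement that interleaver design reduces to selecting polynomial coefficients can be made concrete for the quadratic case, $\pi(i) = (f_1 i + f_2 i^2) \bmod N$. The sketch below is a brute-force check, not the distance-based search described in [19]: it only tests which coefficient pairs yield a valid permutation of $\{0, \dots, N-1\}$. The function names are hypothetical, and $N = 40$ is chosen only to keep the search small.

```python
def is_permutation_polynomial(n, f1, f2):
    """True if i -> (f1*i + f2*i^2) mod n visits every index exactly once."""
    return len({(f1 * i + f2 * i * i) % n for i in range(n)}) == n


def valid_coefficients(n):
    """All pairs (f1, f2) with 0 < f1, f2 < n that define a permutation of Z_n."""
    return [(f1, f2)
            for f1 in range(1, n)
            for f2 in range(1, n)
            if is_permutation_polynomial(n, f1, f2)]


N = 40
candidates = valid_coefficients(N)
print(len(candidates), "valid coefficient pairs for N =", N)
print((3, 10) in candidates)   # the pair used by LTE for N = 40
print((2, 10) in candidates)   # even f1 maps every index to an even address
```

A real design would then rank the surviving coefficient pairs by the minimum distance and multiplicity criteria described above; this sketch stops at the validity check.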

[20] An interleaver is a critical component for the channel coding performance of turbo codes. Algebraic constructions are of particular interest because they admit analytical designs and simple, practical hardware implementation. Contention-free interleavers have recently been shown to be suitable for parallel decoding of turbo codes. In this correspondence, it is shown that permutation polynomials generate maximum contention-free interleavers, i.e., every factor of the interleaver length becomes a possible degree of parallel processing of the decoder. Further, it is shown by computer simulations that turbo codes using these interleavers perform very well for the Third Generation Partnership Project (3GPP) standard.

[21] discussed a procedure for designing high-rate turbo codes via puncturing, with applications to BPSK/QPSK channels. Rates of the form $k/(k+1)$, $2 \le k \le 16$, are considered for constituent encoders with memory sizes $m=3$ and $4$. The algorithm includes the selection of constituent encoder generator polynomials, puncture patterns, and interleavers with the goal of maximizing the minimum codeword weight for weight-two and weight-three inputs. The performance of the proposed codes is found via computer simulation and is observed in each case to be less than 0.9 dB ($m=3$) and 0.75 dB ($m=4$) from their respective channel capacity limits at a bit-error rate of $10^{-5}$. Two applications of punctured turbo codes are considered.
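How puncturing raises the rate of a turbo code to $k/(k+1)$ can be sketched as follows. The pattern below keeps every systematic bit and, in each period of $2k$ positions, one parity bit from each constituent encoder. The positions kept are an arbitrary illustrative choice, not the optimized puncture patterns of [21], and the bit streams are dummy data rather than real encoder outputs. The function names are hypothetical.

```python
from fractions import Fraction


def puncture(systematic, parity1, parity2, k):
    """Keep all systematic bits plus one bit of each parity stream per 2k positions."""
    period = 2 * k
    sent = []
    for i, s in enumerate(systematic):
        sent.append(s)
        if i % period == 0:
            sent.append(parity1[i])      # surviving bit from the first encoder
        elif i % period == k:
            sent.append(parity2[i])      # surviving bit from the second encoder
    return sent


def code_rate(num_info_bits, num_sent_bits):
    """Code rate as an exact fraction."""
    return Fraction(num_info_bits, num_sent_bits)


k = 4
length = 16                              # a multiple of 2k
systematic = [1] * length                # dummy streams standing in for encoder output
parity1 = [0] * length
parity2 = [0] * length

sent = puncture(systematic, parity1, parity2, k)
print(code_rate(length, len(sent)))      # 4/5
```

Without puncturing, the same three streams would give rate 1/3; discarding most parity bits trades error-correction strength for bandwidth, which is why the choice of pattern matters.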

[22] described the use of the Xilinx ChipScope™ Pro integrated logic analyzer for field-programmable gate arrays (FPGAs) as a board-level and system-level diagnostic tool. FPGA designs have become increasingly dense and complex. They are difficult to debug because more and more of the relevant signals are buried deep within the logic fabric. Access to signals in the FPGA, on the board, or in the system is very restricted, whether troubleshooting is done in the lab or in the field. The Xilinx ChipScope™ Pro integrated logic analyzer has solved much of the problem at the FPGA level. In this work, ChipScope™ was used to test and verify a developed system that represents the digital core of a wireless sensor module, implemented on a Spartan 3 based FPGA development board.

[23] presented a geometric approach to the construction of low-density parity-check (LDPC) codes. Four classes of LDPC codes are constructed based on the lines and points of Euclidean and projective geometries over finite fields. Codes of these four classes have good minimum distances and their Tanner (1981) graphs have girth 6. Finite-geometry LDPC codes can be decoded in various ways, ranging from low to high decoding complexity and from reasonably good to very good performance. They perform very well with iterative decoding. Furthermore, they can be put in either cyclic or quasi-cyclic form. Consequently, their encoding can be achieved in linear time and implemented with simple feedback shift registers. This advantage is not shared by other LDPC codes in general and is important in practice. Finite-geometry LDPC codes can be extended and shortened in various ways to obtain other good LDPC codes. Several techniques of extension and shortening are presented. Long extended finite-geometry LDPC codes have been constructed, and they achieve a performance only a few tenths of a decibel away from the Shannon theoretical limit with iterative decoding.

[24] Moore's theories about the future of transistor technology first appeared in Electronics magazine in April 1965. Termed a "law" years later by Caltech professor Carver Mead, Moore's Law went on to become a self-fulfilling prophecy.

[25] The parallel interleaver is an indispensable component of the parallel turbo decoder. The ever-increasing data rates motivated by throughput-intensive applications result in higher parallelism in turbo decoder design, making the efficient realization of the interleaver highly challenging. This paper addresses the design of a quadratic permutation polynomial (QPP) interleaver for symbol-based serial MAP (SMAP) and cross MAP (XMAP) parallel decoding. To circumvent the access conflicts incurred by the randomness of the interleaving function, the strategy to reasonably position the soft outputs of maximum a posteriori (MAP) decoders in memory instances is first investigated. On this basis, a hardware-friendly parallel interleaver architecture is proposed by utilizing the algebraic properties of the QPP function. Theoretical evaluation and experimental results demonstrate that the proposed design achieves higher hardware efficiency than existing schemes, and is applicable to both SMAP and XMAP decoding with arbitrary parallelism.
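The access-conflict problem can be sketched as a simple check. Assume a block of length $N$ is split into $M$ windows of length $W = N/M$, one per MAP decoder, and that memory bank $b$ holds addresses $bW$ to $(b+1)W - 1$. At step $j$, decoder $t$ needs the interleaved address $\pi(j + tW)$, and the access is conflict-free if all $M$ decoders hit different banks. This window and bank model is a common textbook assumption and is not necessarily the memory mapping proposed in [25]; the function names are hypothetical, and the QPP parameters are the LTE values for $N = 40$.

```python
def qpp(i, n, f1, f2):
    """Quadratic permutation polynomial pi(i) = (f1*i + f2*i^2) mod n."""
    return (f1 * i + f2 * i * i) % n


def is_contention_free(n, f1, f2, m):
    """True if m parallel decoders never access the same memory bank at once."""
    if n % m:
        raise ValueError("parallelism must divide the block length")
    w = n // m                                   # window length and bank size
    for j in range(w):
        banks = {qpp(j + t * w, n, f1, f2) // w for t in range(m)}
        if len(banks) != m:                      # two decoders collided in one bank
            return False
    return True


N, F1, F2 = 40, 3, 10
divisors = [m for m in range(1, N + 1) if N % m == 0]
for m in divisors:
    print(f"M={m:2d} contention-free: {is_contention_free(N, F1, F2, m)}")
```

For this QPP every divisor of the block length passes the check, which matches the maximum contention-free property reported in [20].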

#### 3. Conclusion & Future Scope:

In this paper we have studied technology for mobile communication. Mobile technology is designed as an open platform on different layers, from the physical layer up to the application. At present, work is focused on the modules that will offer the best speed and lowest cost for a specified service using one or more wireless technologies. A new revolution in current technology is about to begin, because upgraded-generation technology is going to give tough competition to ordinary computers and laptops, whose marketplace value will be affected. There have been many improvements across 1G, 2G, 3G, 4G, and 5G in the world of mobile communication. The upcoming 5G technology is available in the market at affordable rates, with high peak expectations and much more reliability than its preceding technologies. 5G network technology will open a new era in mobile communication. 5G mobiles will be able to access different wireless technologies at the same time, and the terminal should be able to combine different flows from different technologies. 5G technology offers high resolution for the keen mobile phone consumer. We can watch an HD TV channel on our mobile phones without any disturbance. The 5G mobile phone will be a tablet PC. Many mobile LSI technologies will be developed.

#### **References:**

- [1] S. Lee, N. R. Shanbhag and A. C. Singer, "A 285-MHz Pipelined MAP Decoder in 0.18-µm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 8, pp. 1718-1725, 2005.
- [2] A. P. Hekstra, "An Alternative to Metric Rescaling in Viterbi Decoders," *IEEE Transactions on Communications*, vol. 37, pp. 1220-1222, November 1989.
- [3] M. J. S. Smith, "Application-Specific Integrated Circuits," *Pearson Education (Singapore)*, Seventh Indian Reprint, 2003.
- [4] N. Banerjee, K. Roy, H. Mahmoodi and S. Bhunia, "Low Power Synthesis of Dynamic Logic Circuits Using Fine-Grained Clock Gating," *IEEE Proceedings of Design, Automation and Test in Europe (DATE '06)*, vol. 1, pp. 1-6, March 2006.
- [5] H. Li, S. Bhunia, Y. Chen, K. Roy and T. N. Vijaykumar, "DCG: Deterministic Clock-Gating for Low-Power Microprocessor Design," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 12, pp. 245-254, March 2004.
- [6] C. Lin, C. Chen and A. Wu, "Area-Efficient Scalable MAP Processor Design for High-Throughput Multistandard Convolutional Turbo Decoding," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 2, pp. 305-318, 2011.
- [7] C. Tang, C. Wong, C. Chen, C. Lin and H. Chang, "A 952MS/s Max-Log MAP Decoder Chip using Radix-4 x 4 ACS Architecture," *IEEE Asian Solid-State Circuits Conference (ASSCC)*, pp. 79-82, 2006.
- [8] A. Pulimeno, M. Graziano and G. Piccinini, "UDSM trends comparison: From technology roadmap to UltraSparc Niagara2," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 20, no. 7, pp. 1341-1346, 2012.
- [9] Z. Navabi, "Digital Design and Implementation with Field Programmable Devices," *Springer*, 2005.
- [10] W. F. Lee, "Verilog Coding for Logic Synthesis," *John Wiley & Sons, Inc.*, 2003.
- [11] R. Dobkin, M. Peleg and R. Ginosar, "Parallel Interleaver Design and VLSI Architecture for Low-Latency MAP Turbo Decoders," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 13, no. 4, pp. 427-438, 2005.
- [12] Y. Sun, Y. Zhu, M. Goel and J. R. Cavallaro, "Configurable and Scalable High Throughput Turbo Decoder Architecture for Multiple 4G Wireless Standards," *IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP)*, pp. 209-214, July 2008.
- [13] J. Sun and O. Y. Takeshita, "Interleavers for Turbo Codes Using Permutation Polynomials over Integer Rings," *IEEE Transactions on Information Theory*, vol. 51, pp. 101-119, January 2005.
- [14] O. Y. Takeshita, "On Maximum Contention-Free Interleavers and Permutation Polynomials over Integer Rings," *IEEE Transactions on Information Theory*, vol. 52, pp. 1249-1253, March 2006.
- [15] O. F. Acikel and W. E. Ryan, "Punctured Turbo-Codes for BPSK/QPSK Channels," *IEEE Transactions on Communications*, vol. 47, no. 9, pp. 1315-1323, 1999.
- [16] C. Hall, "Performance Analysis and Design of Punctured Turbo Codes," *Doctoral thesis: University of Cambridge, Department of Engineering*, 2006.
- [17] K. Arshak, E. Jafer and C. Ibala, "Testing FPGA based Digital System using Xilinx ChipScope™ Logic Analyzer," *29th International Spring Seminar on Electronics Technology (ISSE '06)*, pp. 355-360, 2006.
- [18] "Quartus II Handbook Version 13.1, Volume 1: Design and Synthesis," *ALTERA Corporation*, November 2013.
- [19] "Cyclone V Device Handbook, Volume 1: Device Interfaces and Integration (Version 2013.11.12)," *ALTERA Corporation*, November 2013.
- [20] R. G. Gallager, "Low-Density Parity-Check Codes," *Doctoral thesis: Massachusetts Institute of Technology*, 1963.
- [21] Y. Kou, S. Lin and M. P. C. Fossorier, "Low-Density Parity-Check Codes Based on Finite Geometries: A Rediscovery and New Results," *IEEE Transactions on Information Theory*, vol. 47, no. 7, pp. 2711-2736, November 2001.
- [22] G. Moore, "Cramming More Components on Integrated Circuits," *Electronics Magazine*, vol. 38, no. 8, April 1965.
- [23] IEEE 802.16e, "LDPC Coding for OFDMA PHY," *IEEE Doc. C802-16e-05/066r3*, January 2005.
- [24] N. H. E. Weste and D. Harris, "CMOS VLSI Design: A Circuits and Systems Perspective," *Reading, MA: Pearson-Addison Wesley*, 3rd International edition, 2005.
- [25] S. Lee, C. Wang and W. Sheen, "Architecture Design of QPP Interleaver for Parallel Turbo Decoding," *Proceedings of IEEE Vehicular Technology Conference (VTC)*, pp. 1-5, 2010.